Robust language recognition via adaptive language factor extraction
Abstract
This paper presents a technique for adapting an acoustically based language classifier to background conditions and speaker accents. This adaptation improves language classification on a broad spectrum of TV broadcasts. The core of the system is an iVector-based setup in which language and channel variabilities are modeled separately. The subsequent language classifier (the backend) operates on the language factors, i.e., those features in the extracted iVectors that explain the observed language variability. The proposed technique adapts the language variability model to the background conditions and speaker accents present in the audio. The effect of the adaptation is evaluated on a 28-hour corpus composed of documentaries and both monolingual and multilingual broadcast news shows. Consistent improvements in the automatic identification of Flemish (Belgian Dutch), English, and French are demonstrated for all broadcast types.
Similar resources
Bhattacharyya-based GMM-SVM system with adaptive relevance factor for pair language recognition
In this paper, we develop a hybrid system for pair language recognition using Gaussian mixture model (GMM) supervector connecting to support vector machine (SVM). The adaptation of relevance factor in maximum a posteriori (MAP) adaptation of GMM from universal background model (UBM) is studied. In conventional MAP, relevance factor is empirically given by a constant value. It has been proven th...
Noise-resistant Feature Extraction and Model Training for Robust Speech Recognition
In this paper we report on our recent work on noise-robust feature extraction and model training to alleviate the mismatch caused by different microphones and ambient room noise in the context of the 1995 DARPA-sponsored H3 benchmark test, which used the unlimited-vocabulary North American Business News (NABN) database. We present a novel noise-robust feature extraction algorithm that is a combi...
Robust numeric recognition in spoken language dialogue
This paper addresses the problem of automatic numeric recognition and understanding in spoken language dialogue. We show that accurate numeric understanding in fluent unconstrained speech demands maintaining robustness at several different levels of system design, including acoustic, language, understanding and dialogue. We describe a robust system for numeric recognition and present algorithms f...
Multi-microphone speech recognition integrating beamforming, robust feature extraction, and advanced DNN/RNN backend
This paper gives an in-depth presentation of the multi-microphone speech recognition system we submitted to the 3rd CHiME speech separation and recognition challenge (CHiME-3) and its extension. The proposed system takes advantage of recurrent neural networks (RNNs) throughout the model from the front-end speech enhancement to the language modeling. Three different types of beamforming are used...
A framework for sign gesture recognition using improved genetic algorithm and adaptive filter
Gesture-based communication is the standard language used by hard-of-hearing individuals to communicate. Although they converse precisely with each other in sign language, they face difficulty when they try to communicate with hearing individuals, particularly those who cannot understand sign language. Consequently, an effective meth...